Abstract: The problem of Dirty Paper Coding (DPC)
over the Fading Dirty Paper Channel (FDPC) Y = H(X + 
S) + Z, a more general version of Costa's channel, is studied 
for the case in which there is partial and perfect knowledge 
of the fading process H at the transmitter (CSIT) and the 
receiver (CSIR), respectively. A key step in this problem 
is to determine the optimal inflation factor (under Costa's 
choice of auxiliary random variable) when there is only par- 
tial CSIT. Towards this end, two iterative numerical algo- 
rithms are proposed. Both of these algorithms are seen to 
yield a good choice for the inflation factor. Finally, the high- 
SNR (signal-to-noise ratio) behavior of the achievable rate 
over the FDPC is analyzed. It is proved that the FDPC (with t
transmit and r receive antennas) achieves the largest possible
scaling factor of min(t, r) log SNR even with no CSIT.
Furthermore, in the high-SNR regime, the optimality of Costa's
choice of auxiliary random variable is established even when
there is partial (or no) CSIT in the special case of FDPC
with t ≤ r. Using the high-SNR scaling-law result of the
FDPC (mentioned before), it is shown that a DPC-based 
multi-user transmission strategy, unlike other beamforming- 
based multi-user strategies, can achieve a single-user sum- 
rate scaling factor over the multiple-input multiple-output 
Gaussian Broadcast Channel with partial (or no) CSIT. 

Index Terms — Auxiliary random variable, Dirty Paper 
Coding, Inflation factor. 

I. Introduction 

In this paper, we study a more general version of Costa's
(original) Dirty Paper Coding (DPC) problem in which a
fading process is present, i.e., the problem of DPC over
a channel of the form Y = H(X + S) + Z, which we call
the Fading Dirty Paper Channel (FDPC). We study this 
problem for the case in which there is partial and per- 
fect knowledge of the fading process H at the transmitter 
(CSIT) and at the receiver (CSIR), respectively. Before 
continuing with the problem at hand, we first explain the 
original DPC problem studied by Costa. 

Costa's work [1] is based on the capacity formula of
Gelfand and Pinsker. They proved that the capacity of
a discrete memoryless channel $p(y|x,s)$ with side information
S known non-causally at the transmitter but not at
the receiver is given by

$$C = \max_{p(u,x|s)} \left[ I(U;Y) - I(U;S) \right], \tag{1}$$

where U is a finite-alphabet auxiliary random variable
(RV) [2]. Costa used this formula to find the capacity
of the channel Y = X + S + Z, where X is the transmitted
signal with power constraint $E(X^2) \le P$; the interference S is
a zero-mean Gaussian RV with variance Q ($S \sim \mathcal{N}(0, Q)$) and
is assumed to be known non-causally at the transmitter
but not at the receiver; and $Z \sim \mathcal{N}(0, N)$ is the additive
noise. With a Gaussian input distribution and the choice
of auxiliary RV $U = X + \alpha S$, where X and S are independent
and the parameter $\alpha$ is the inflation factor, whose
optimal value was found to be $P/(P+N)$, Costa proved that
the interference S does not result in any loss of capacity, i.e.,
$C = \frac{1}{2}\log(1 + P/N)$. Costa named this technique of canceling
the known interference Dirty Paper Coding (DPC).
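
Costa's result is easy to check numerically in the scalar Gaussian case. The sketch below is illustrative only: the function name `costa_rate` and the power values are assumptions, and the closed form used for the determinant of the covariance of (U, Y) is the scalar expression obtained by expanding $I(U;Y) - I(U;S)$ for $U = X + \alpha S$.

```python
import math

def costa_rate(alpha, P, Q, N):
    """I(U;Y) - I(U;S) in nats for U = X + alpha*S over Y = X + S + Z (all Gaussian)."""
    # Determinant of the 2x2 covariance matrix of (U, Y).
    det_uy = P * Q * (1.0 - alpha) ** 2 + N * (P + alpha ** 2 * Q)
    return 0.5 * math.log(P * (P + Q + N) / det_uy)

P, Q, N = 4.0, 10.0, 1.0            # illustrative powers (assumed values)
alpha_opt = P / (P + N)             # Costa's inflation factor
rate = costa_rate(alpha_opt, P, Q, N)
capacity = 0.5 * math.log(1.0 + P / N)   # interference-free capacity
```

With these values, `rate` coincides with `capacity`, while other choices such as $\alpha = 0$ or $\alpha = 1$ give strictly smaller rates.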

The problem of current interest, i.e., the application of 
DPC to FDPC with partial CSIT is of practical impor- 
tance from the point of view of studying the performance 
of DPC over the multiple-input multiple-output (MIMO) 
Gaussian Broadcast Channel (BC) with partial CSIT. The 
most challenging part of this problem is to find the optimal 
inflation factor. Though this problem has been considered 
before, the existing solutions are not satisfactory. In [3], 
Bennatan et al. suggest a numerical approach which in- 
volves an exhaustive search over a set which is arbitrarily 
restricted to inflation factors that are optimal under per- 
fect CSIT. There is no reason for the inflation factor op- 
timal under partial CSIT to belong to this set. Moreover, 
such an exhaustive search can be impractical or even impossible
to implement. In [4], Zhang et al. suggest the use
of Costa's inflation factor (i.e., $\alpha = P/(P+N)$) over the SISO
FDPC as well. This choice is clearly not optimal. Lastly,
the paper [5] by Piantanida et al. studies only a very specific
setting of the SISO FDPC and therefore lacks generality.
Thus the important problem of determination of inflation 
factor still remains unresolved. 

In this paper, we develop two algorithms for finding the 
inflation factor under partial CSIT. In simulations, these
algorithms are seen to yield a good choice for the inflation
factor in most cases. The paper then turns to the high-SNR
(signal-to-noise ratio) analysis of the FDPC, and some
key results are proved on this front.

Notation: An upper-case letter (e.g., X) denotes a RV 
while the corresponding lower-case letter (e.g., x) denotes 
its realization. $E_X(\cdot)$ denotes expectation over RV X. For a
matrix A, $|A|$ denotes its determinant, tr(A) is its
trace, and its complex-conjugate transpose is $A^*$. I denotes
the identity matrix.

II. Channel Model 
A t x r FDPC is given by 

Y = H(X + S) + Z. (2) 

Here, the transmitted signal X is a complex normal random
vector with mean 0 and covariance matrix $\Sigma_X$ (i.e.,
$X \sim \mathcal{CN}(0, \Sigma_X)$) and has a power constraint of $\mathrm{tr}(\Sigma_X) \le P$;
$S \sim \mathcal{CN}(0, \Sigma_S)$ is the interference known non-causally
at the transmitter but not at the receiver; $Z \sim \mathcal{CN}(0, \Sigma_Z)$
is the additive noise; and X, S, and Z are assumed to be
independent. The channel fading matrix H is assumed to be
known perfectly at the receiver, whereas we let $\hat{H}$ be the
transmitter's estimate of H. Further assume that H is full
rank with probability 1 and $|\Sigma_X|, |\Sigma_Z| > 0$. Let $|\Sigma_S| = Q$,
$|\Sigma_Z| = N$, and define the signal-to-noise ratio (SNR) to be $P/N$.
We choose the auxiliary RV as U = X + WS (Costa's choice
extended to the MIMO FDPC), where the t × t matrix W
is the inflation factor.

Using the capacity formula of [6] (which is a generalization
of [1]), we derive the achievable rate over the FDPC
with partial CSIT as



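$$R = E_{\hat{H}}\left[ E_{H|\hat{H}} \log\left\{ |\Sigma_X|\, |H(\Sigma_X + \Sigma_S)H^* + \Sigma_Z| \right\} - \min_{W} E_{H|\hat{H}} \log \begin{vmatrix} \Sigma_X + W\Sigma_S W^* & (\Sigma_X + W\Sigma_S)H^* \\ H(\Sigma_X + \Sigma_S W^*) & H(\Sigma_X + \Sigma_S)H^* + \Sigma_Z \end{vmatrix} \right]. \tag{3}$$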



Minimization over W in the second term above is precisely 
the problem of determination of inflation factor. We define 
the no-interference upper-bound on the achievable rate as 
the rate achievable over the FDPC in absence of interfer- 
ence (i.e., when Q = 0). 

Under perfect CSIT, it is well established that the choice 
U = X + WS is optimal and the optimal inflation factor is 
given by $W_{opt} = \Sigma_X H^*(H\Sigma_X H^* + \Sigma_Z)^{-1} H$. However, unlike
the perfect-CSIT case, under partial CSIT, the choice
U = X + WS is not known to be optimal. Further, even if 
this choice of auxiliary RV is assumed, the problem of find- 
ing a closed-form solution for the optimal inflation factor 
appears intractable. 
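
The perfect-CSIT statement can be checked numerically for a single channel realization. The following is a minimal sketch, assuming illustrative covariances (scaled identities) and a hypothetical helper `fdpc_rate` that evaluates the quantity inside the expectations of (3) for a given H and W; none of these names or values come from the paper.

```python
import numpy as np

def fdpc_rate(H, W, Sx, Ss, Sz):
    """Rate (nats) inside the expectations of (3) for one channel realization H."""
    Sy = H @ (Sx + Ss) @ H.conj().T + Sz
    # Joint covariance of (U, Y) for U = X + W S, i.e. the block matrix in (3).
    M = np.block([
        [Sx + W @ Ss @ W.conj().T, (Sx + W @ Ss) @ H.conj().T],
        [H @ (Sx + Ss @ W.conj().T), Sy],
    ])
    logdet = lambda A: np.linalg.slogdet(A)[1]
    return logdet(Sx) + logdet(Sy) - logdet(M)

rng = np.random.default_rng(0)
t, r = 3, 2
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
Sx, Ss, Sz = 2.0 * np.eye(t), 5.0 * np.eye(t), np.eye(r)   # assumed covariances

# Perfect-CSIT inflation factor and the resulting rate.
W_opt = Sx @ H.conj().T @ np.linalg.inv(H @ Sx @ H.conj().T + Sz) @ H
rate_dpc = fdpc_rate(H, W_opt, Sx, Ss, Sz)
# Rate of the same channel without interference.
rate_no_interference = np.linalg.slogdet(
    np.eye(r) + H @ Sx @ H.conj().T @ np.linalg.inv(Sz))[1]
```

For this realization, `rate_dpc` matches `rate_no_interference` up to numerical precision, whereas a fixed choice such as W = I generally gives a smaller value.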

III. Determination of Inflation Factor 

We first consider the case of SISO FDPC separately and 
then move on to the general MIMO case. 

A. SISO FDPC (t = r =1) 

In the case of SISO FDPC (for which the inflation factor 
is scalar), it is possible to generalize the known perfect- 
CSIT result to the partial-CSIT case. 

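Proposition 1: For the SISO FDPC, the optimal inflation
factor $W_{opt} \in [0, 1]$, irrespective of H and $\hat{H}$.

Proof: In the SISO case, the problem reduces to

$$\min_{W} E_{H|\hat{H}} \log\left\{ |H|^2 PQ\,|1 - W|^2 + |W|^2 QN + PN \right\}.$$

It can be seen that $W_{opt}$ must be real: for a fixed $|W|$, only the
term $|1 - W|^2$ depends on the phase of W, and it is smallest when W
is real and non-negative. Also, for any value of
H = h, the function $f(W) = |h|^2 PQ\,|1 - W|^2 + |W|^2 QN + PN$
is quadratic in (real) W, and the W minimizing f(W), namely
$|h|^2 P/(|h|^2 P + N)$, lies in the interval [0, 1] for any h.
Hence moving any W outside [0, 1] to the nearest endpoint decreases
f(W) for every h, and therefore decreases the expectation as well. ■
This proposition helps in the numerical determination of $W_{opt}$.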
The result of this proposition is quite surprising because 
such a result can not be proved if the fading coefficients 
multiplying X and S are different. 
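
Proposition 1 reduces the SISO problem to a one-dimensional search over [0, 1]. A minimal sketch of such a search is given below for the no-CSIT case; the Rayleigh-fading model, the power values, the grid, and the name `siso_objective` are all assumptions made for illustration.

```python
import numpy as np

def siso_objective(W, h_abs2, P, Q, N):
    """Monte Carlo estimate of E_H log(|H|^2 P Q (1-W)^2 + W^2 Q N + P N) for real W."""
    return float(np.mean(np.log(h_abs2 * P * Q * (1.0 - W) ** 2
                                + W ** 2 * Q * N + P * N)))

rng = np.random.default_rng(1)
h_abs2 = rng.exponential(1.0, size=20000)   # |H|^2 under Rayleigh fading (assumed), no CSIT
P, Q, N = 10.0, 10.0, 1.0                   # illustrative powers

# The grid deliberately extends beyond [0, 1] to illustrate the proposition.
grid = np.linspace(-0.5, 1.5, 201)
values = np.array([siso_objective(W, h_abs2, P, Q, N) for W in grid])
W_best = float(grid[np.argmin(values)])
```

The minimizer `W_best` found on the wider grid still falls inside [0, 1], so in practice the search can be restricted to that interval.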

B. MIMO FDPC 

A result analogous to Proposition 1 cannot be proved
in the MIMO case. Further, the minimization problem
in (3) is a non-convex optimization problem. Also, the
required conditional expectations are analytically intractable
for general H and $\hat{H}$. This makes the problem difficult
in the case of the MIMO FDPC. We now propose two suboptimal
algorithms for finding the inflation factor. The main
advantage of our algorithms is that they can be used over
any general FDPC, irrespective of the distribution of H
and the nature of the partial CSIT $\hat{H}$.

Algorithm 1: The key idea behind this algorithm is 
to minimize the objective function stepwise, i.e., at each 
step, we minimize over one row of W while treating all 
other rows as constants. It turns out that the objective 
function when regarded as a function of only one row has 
a form that is amenable to analytical closed-form mini- 
mization if we upper-bound it by moving the expectation 
inside the logarithm. Thus, at every iteration, minimiza- 
tion over each row of W is carried out successively, and 
these iterations are repeated until a good choice is found. 

Let us now consider the minimization over the first
row of W while treating all other rows of it as constants
(a subscript 1 denotes the first row and a superscript $\bar{1}$ the
removal of the first column; see footnote 1). Observe that only
the first row and the first column of the block-partitioned
matrix in (3), which we call M, depend on $W_1$. Thus, we
repartition M, i.e., write

$$M = \begin{bmatrix} a & B^* \\ B & D \end{bmatrix},$$

where $a = \Sigma_{X_{11}} + W_1 \Sigma_S (W_1)^*$ is $M_{11}$,
$B^* = \left[\, \Sigma_{X_1}^{\bar{1}} + W_1 \Sigma_S (W^*)^{\bar{1}} \quad (\Sigma_{X_1} + W_1 \Sigma_S)H^* \,\right]$, and D is the
$(t+r-1) \times (t+r-1)$ square matrix that remains after excluding
the first row and the first column from M (see footnote 2). Now, we
have $|M| = |D|\,|a - B^* D^{-1} B|$. Since we are minimizing
only over $W_1$ while treating all other rows of W as constants,
$|D|$ will not affect the minimization. In order that
the required conditional expectations in the above minimization
can be computed (even) numerically, we upper-bound
the objective function using Jensen's inequality.
Thus we arrive at the following optimization problem:



$$E_{\hat{H}} \log \min_{W_1} E_{H|\hat{H}}\left( a - B^* D^{-1} B \right). \tag{4}$$



If t = 1, we can directly obtain W as given by equation (5);
otherwise we proceed as follows. We first evaluate the objective
function above. Since B is in partitioned form, we partition
$D^{-1}$ accordingly, i.e., let
$D^{-1} = \begin{bmatrix} F & G \\ J & K \end{bmatrix}$, where F is a $(t-1) \times (t-1)$ matrix,
K is an $r \times r$ matrix, and $G = J^*$. With this, we get equation
(6). It can be observed that the expression $a - B^* D^{-1} B$
is quadratic in $W_1$. Thus, using the technique of 'completing
the square,' one can evaluate the optimal $W_1$, which is given
by equation (7) (see footnote 3). Here, we need conditional expectations
of F, GH (or $H^*J$), and $H^*KH$, which can be evaluated
numerically through Monte Carlo simulations.

1 Notation used in Algorithm 1: For a matrix A, $A_1$ denotes the first
row of A, $A^1$ denotes its first column, $A_{11}$ is its (1,1) element,
$A^{\bar{1}}$ is the entire matrix A except its first column, and $A_{\bar{1}}$ is the entire
matrix A except its first row. Thus, for example, $(W^*)^{\bar{1}}$ is the
entire matrix $W^*$ except for its first column; $(W^*)^{\bar{1}} = (W_{\bar{1}})^*$; and
$\Sigma_{X_1}^{\bar{1}}$ is the part of the first row of $\Sigma_X$ except for its (1,1) element.

2 Matrix D can be proved to be invertible with probability 1 for
any choice of W.

3 It turns out that the matrix inverted in (7) is invertible if and only
if $\Sigma_S$ is invertible. This can be easily fixed via spectral decomposition
of $\Sigma_S$. Due to lack of space, we omit the details here.
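
For t = 1 the Jensen-bounded objective is a scalar quadratic in W, and its minimizer has the form stated in (5). The sketch below evaluates that expression by Monte Carlo for the no-CSIT case; the i.i.d. complex Gaussian channel, the values of P and Q, the identity noise covariance, and the function names are assumptions made for illustration only.

```python
import numpy as np

def inflation_factor_t1(H_samples, P, Q, Sz):
    """Evaluate (5) for t = 1: W = P c / (1 - Q c), c = E[H* (H (P+Q) H* + Sz)^{-1} H]."""
    c_vals = []
    for h in H_samples:                          # each h has shape (r, 1)
        D = (P + Q) * (h @ h.conj().T) + Sz
        c_vals.append((h.conj().T @ np.linalg.solve(D, h)).real.item())
    c = float(np.mean(c_vals))                   # Monte Carlo estimate of the expectation
    return P * c / (1.0 - Q * c), c

def jensen_bound(W, c, P, Q):
    """E(a - B* D^{-1} B) for t = 1 and real W (the argument of the logarithm in (4))."""
    return P + W ** 2 * Q - c * (P + W * Q) ** 2

rng = np.random.default_rng(5)
r = 2
H_samples = (rng.standard_normal((500, r, 1))
             + 1j * rng.standard_normal((500, r, 1))) / np.sqrt(2)
P, Q = 10.0, 10.0                                # illustrative powers
W, c = inflation_factor_t1(H_samples, P, Q, np.eye(r))
```

Perturbing `W` in either direction increases `jensen_bound`, which is a quick sanity check that the closed form is the minimizer of the bounded objective.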

Now, consider the minimization over $W_k$, treating all other
rows of W as constants. As in the case of $W_1$, observe
that only the k-th row and the k-th column of M depend
on $W_k$. Thus, by interchanging the k-th row of M with
its first row and the k-th column with its first column, we
obtain a matrix $M'$ in which only the first row and the first
column depend on $W_k$. Hence, we can use the procedure
described before to minimize $|M'| = |M|$ over $W_k$.

An iterative algorithm can now be set up as follows: 

1. Start with some initial choice for W. We typically
use W = I for this purpose.

2. Loop: k = 1 to t

• Minimize over $W_k$ using the procedure described before.

• Update W with the $W_k$ just found, and update the matrix M accordingly.

3. Repeat Step 2 until the increase in the achievable rate
is negligible.

The algorithm produces a bounded (from below), mono- 
tonically decreasing sequence of upper-bounds on the ob- 
jective function with iteration steps and hence, it con- 
verges. 

Simulation 1: For simplicity we take the elements of H 
to be independent and identically distributed as $\mathcal{N}(0, 1)$.
We quantize each element separately using a simple equal 
spacing level quantizer (quantization bins are of equal 
length except for the first and last bins that extend to 
infinity) and the spacing is determined using data from 
[7]. In Fig. 1(a), B denotes the number of feedback bits 
per element of matrix H . 
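
The quantizer described above can be sketched as follows. The spacing used in the simulations is taken from [7] and is not reproduced here, so `delta` is left as a free parameter; using the bin midpoints as reproduction points is likewise an assumption, as is the function name.

```python
import numpy as np

def quantize_uniform(x, bits, delta):
    """Equal-spacing scalar quantizer; the first and last bins extend to infinity."""
    levels = 2 ** bits
    # levels - 1 equally spaced bin edges, symmetric around zero.
    thresholds = (np.arange(1, levels) - levels / 2) * delta
    index = np.digitize(x, thresholds)                       # bin index in 0..levels-1
    # One reproduction point per bin (assumed: midpoints of the equal-length bins).
    points = (np.arange(levels) - (levels - 1) / 2) * delta
    return index, points[index]

index, h_hat = quantize_uniform(np.array([-10.0, 0.1, 10.0]), bits=2, delta=1.0)
```

With B = 2 bits and unit spacing, the three inputs fall in bins 0, 2, and 3; the two extreme values are absorbed by the unbounded outer bins.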

Fig. 1(a) plots the achievable rate for the 3x2 and 3x3 
MIMO FDPCs as a function of P. Comparing R with the 
no-interference upper-bound, it can be said that our algo- 
rithm finds a good choice for the inflation factor. Also, 
some important observations can be made regarding the 
high-SNR behavior of the achievable rate. These observa- 
tions have been formally stated and proved as Theorems 1 
and 2 later in this paper. 

Algorithm 2: In the previous algorithm, we minimize 
the upper-bound (obtained via Jensen's Inequality) on the 
objective function. The key idea of this second algorithm
is to solve the Karush-Kuhn-Tucker conditions for the optimization
problem or, in other words, to find a stationary
point of the objective function.

Thus, we need to solve the equation $\frac{\partial}{\partial W} E_H \log|M| = 0$,
where M is the block-partitioned matrix in (3), as in
the derivation of Algorithm 1. For simplicity, we consider
in this derivation only the case of no CSIT, since the
results can be easily extended to the partial-CSIT case.
We start by finding the differential as follows (see footnote 4
for the complex-valued case):

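$$\begin{aligned}
d\,E_H \log|M| &= E_H\, \mathrm{tr}\left\{ M^{-1}\, dM \right\} \\
&= E_H\, \mathrm{tr}\left\{ M^{-1} \begin{bmatrix} dW\,\Sigma_S W^* + W\Sigma_S (dW)^* & (dW)\Sigma_S H^* \\ H\Sigma_S (dW)^* & 0 \end{bmatrix} \right\} \\
&= 2 E_H\, \mathrm{tr}\left\{ \Sigma_S \begin{bmatrix} W^* & H^* \end{bmatrix} M^{-1} \begin{bmatrix} dW \\ 0 \end{bmatrix} \right\}.
\end{aligned}$$

Thus we arrive at the equation

$$\frac{\partial}{\partial W} E_H \log|M| = E_H\left( A_1 W + A_2^* H \right)\Sigma_S = 0, \quad \text{where } \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = M^{-1} \begin{bmatrix} I \\ 0 \end{bmatrix}. \tag{8}$$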



It can be proved that, without loss of generality, we can consider
a solution of the form $W = -(E_H A_1)^{-1} E_H(A_2^* H) = -g(W)$
for the above equation (see footnote 5), even when $|\Sigma_S| = 0$. This
fixed-point equation for W allows us to set up the following
iterative algorithm.

1. Start with some initial choice for W, i.e., $W^{(0)}$.

2. At the n-th iteration, set $W^{(n)} = -g(W^{(n-1)})$, with the
required expectations computed numerically.

3. Repeat Step 2 until the improvement in the achievable
rate is negligible.

Since our optimization problem is non-convex, the algo- 
rithm does not necessarily yield the optimum solution. 
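
The fixed-point iteration can be sketched as below for the real-valued case treated in the derivation. This is a hedged illustration rather than the authors' implementation: $A_1$ and $A_2$ are read here as the top-left t × t and bottom-left r × t blocks of $M^{-1}$, expectations are replaced by sample means, the starting point W = I and the fixed iteration count (instead of the rate-based stopping rule) are assumptions, and all function names are hypothetical.

```python
import numpy as np

def fixed_point_step(W, H_samples, Sx, Ss, Sz):
    """One update W <- -(E A1)^{-1} E(A2^T H), real-valued case, sample-mean expectations."""
    t = Sx.shape[0]
    A1_mean = np.zeros((t, t))
    A2H_mean = np.zeros((t, t))
    for H in H_samples:
        M = np.block([
            [Sx + W @ Ss @ W.T, (Sx + W @ Ss) @ H.T],
            [H @ (Sx + Ss @ W.T), H @ (Sx + Ss) @ H.T + Sz],
        ])
        M_inv = np.linalg.inv(M)
        A1, A2 = M_inv[:t, :t], M_inv[t:, :t]    # first t columns of M^{-1}
        A1_mean += A1 / len(H_samples)
        A2H_mean += A2.T @ H / len(H_samples)
    return -np.linalg.solve(A1_mean, A2H_mean)

def algorithm2(H_samples, Sx, Ss, Sz, n_iter=100):
    W = np.eye(Sx.shape[0])                      # initial choice W^(0) = I (an assumption)
    for _ in range(n_iter):
        W = fixed_point_step(W, H_samples, Sx, Ss, Sz)
    return W

# Sanity check with a single known realization (i.e., perfect CSIT).
rng = np.random.default_rng(2)
H = rng.standard_normal((2, 2))
Sx, Ss, Sz = np.eye(2), np.eye(2), np.eye(2)
W_fp = algorithm2([H], Sx, Ss, Sz)
W_ref = Sx @ H.T @ np.linalg.inv(H @ Sx @ H.T + Sz) @ H
```

When the sample set contains a single realization, the iteration settles on the perfect-CSIT inflation factor `W_ref`. With many realizations no such guarantee is claimed, in line with the non-convexity noted above.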

Simulation 2: It is observed, rather surprisingly, that
in most cases both algorithms yield Ws that
achieve almost equal rates over the FDPC. Thus, rather
than considering such cases, we include here some of the
plots in which one algorithm does better than the other
in terms of the achievable rate. Figs. 1(b) and 1(c) show
some of these cases. However, it has not been possible to
precisely characterize such cases. The main advantage of
Algorithm 1 over Algorithm 2 is that Algorithm 1 is usually
faster in terms of the number of iterations required.

4 Since the matrices involved in the optimization are all complex-valued,
we basically need to convert the problem to an equivalent
optimization problem involving only real-valued variables using the
technique of [8]. However, due to lack of space, we consider in this
derivation only the case of real-valued variables. It can be proved
that even in the case of complex-valued variables the equation (8)
still holds.

5 Matrix $A_1$ is invertible with probability 1.

$$W = P\,E\!\left(H^*\{H(P+Q)H^* + \Sigma_Z\}^{-1}H\right)\left\{1 - Q\,E\!\left(H^*\{H(P+Q)H^* + \Sigma_Z\}^{-1}H\right)\right\}^{-1}, \quad \text{where } E(\cdot) = E_{H|\hat{H}}(\cdot). \tag{5}$$

$$\begin{aligned}
a - B^*D^{-1}B = {} & \Sigma_{X_{11}} + W_1\Sigma_S(W_1)^* - \Big\{ \big(\Sigma_{X_1}^{\bar{1}} + W_1\Sigma_S(W^*)^{\bar{1}}\big) F \big(\Sigma_{X_1}^{\bar{1}} + W_1\Sigma_S(W^*)^{\bar{1}}\big)^* \\
& + (\Sigma_{X_1} + W_1\Sigma_S)H^* J \big(\Sigma_{X_1}^{\bar{1}} + W_1\Sigma_S(W^*)^{\bar{1}}\big)^* + \big(\Sigma_{X_1}^{\bar{1}} + W_1\Sigma_S(W^*)^{\bar{1}}\big) G H (\Sigma_{X_1} + W_1\Sigma_S)^* \\
& + (\Sigma_{X_1} + W_1\Sigma_S)H^* K H (\Sigma_{X_1} + W_1\Sigma_S)^* \Big\}.
\end{aligned} \tag{6}$$

$$\begin{aligned}
W_1 = {} & \Big( \Sigma_{X_1}^{\bar{1}} E(F) W_{\bar{1}} \Sigma_S + \Sigma_{X_1} E(H^*J) W_{\bar{1}} \Sigma_S + \Sigma_{X_1}^{\bar{1}} E(GH) \Sigma_S + \Sigma_{X_1} E(H^*KH) \Sigma_S \Big) \\
& \times \Big( \Sigma_S - \Sigma_S (W_{\bar{1}})^* E(F) W_{\bar{1}} \Sigma_S - \Sigma_S E(H^*J) W_{\bar{1}} \Sigma_S - \Sigma_S (W_{\bar{1}})^* E(GH) \Sigma_S - \Sigma_S E(H^*KH) \Sigma_S \Big)^{-1},
\end{aligned} \quad \text{where } E(\cdot) = E_{H|\hat{H}}(\cdot). \tag{7}$$

It is in order here to acknowledge a limitation of
these algorithms. It is sometimes observed that, for systems
with t > r, these algorithms do not yield a good solution.
One such example is seen in Fig. 2, in which it seems likely
to us that some improvement in R is possible in the range of
P = 0 to 10 dB. It has been puzzling that even though these
algorithms solve for two different things (one minimizes an
upper-bound on the objective function while the other solves
for a stationary point of it), both of them fail at the same
time and find nearly the same solutions (in terms of R).

Application to the lattice-based transmission strategies: 
A lattice-based transmission scheme for the MIMO FDPC 
with no CSIT has been proposed in [10]. The problem of
determination of inflation factor also appears in the design
of such a scheme, for which the paper [10] does not provide
any satisfactory solution. Thus, our algorithms can be used 
for the design of lattice-based schemes as well. 

IV. High SNR Analysis 

In this section, we deal with the high-SNR behavior of 
the achievable rate and prove some important results. 

A. FDPC 

We begin with the high-SNR scaling law of the FDPC. 

Theorem 1: The rate achievable over the no (and hence
partial) CSIT FDPC using DPC scales optimally in the
high-SNR regime as min(r, t) log SNR if the ratio $Q/P$ is held
constant as $P \to \infty$.

Proof: It is clear that the scaling factor of the FDPC with
partial or no CSIT is upper-bounded by min(r, t) log SNR.
We, in fact, prove that it is equal to min(r, t) log SNR
by proving that a lower-bound on the achievable rate
achieves this scaling factor.

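Let us consider a lower-bound obtained by using the particular
choice W = I:

$$\begin{aligned}
R &\ge E_H \log\left\{ |\Sigma_X|\,|H(\Sigma_X + \Sigma_S)H^* + \Sigma_Z| \right\} - E_H \log \begin{vmatrix} \Sigma_X + \Sigma_S & (\Sigma_X + \Sigma_S)H^* \\ H(\Sigma_X + \Sigma_S) & H(\Sigma_X + \Sigma_S)H^* + \Sigma_Z \end{vmatrix} \\
&= E_H \log \frac{|\Sigma_X|\,|\Sigma_Z + H(\Sigma_X + \Sigma_S)H^*|}{|\Sigma_X + \Sigma_S|\,|\Sigma_Z|} \qquad (9) \\
&\ge E_H \log\left( \frac{|\Sigma_X|}{|\Sigma_X + \Sigma_S|} \cdot \frac{|\Sigma_Z + H\Sigma_X H^*|}{|\Sigma_Z|} \right), \qquad (10)
\end{aligned}$$

where the equality in (9) follows from row and column
operations, whereas the inequality in (10) follows because
$\Sigma_X + \Sigma_S \succeq \Sigma_X$ in the positive semidefinite partial order.

Since $Q/P$ is assumed to remain constant as $P \to \infty$, the
first factor $|\Sigma_X|/|\Sigma_X + \Sigma_S|$ remains bounded as $P \to \infty$. The
second factor, $E_H \log\left(|\Sigma_Z + H\Sigma_X H^*|/|\Sigma_Z|\right)$, can be made to scale in the
high-SNR regime as min(r, t) log SNR, even without any
CSIT, by choosing an appropriate $\Sigma_X$. ■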
The theorem above proves the achievability of the largest 
possible scaling factor over the partial or no CSIT MIMO 
FDPC and in this sense, the statement of the theorem is 
the strongest. 
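
The scaling law can be illustrated numerically by evaluating the W = I lower bound at two large values of P and estimating the pre-log factor from the slope. The sketch below assumes isotropic covariances ($\Sigma_X = (P/t)I$, $\Sigma_S$ proportional to $\Sigma_X$, $\Sigma_Z = I$) and i.i.d. complex Gaussian fading; these choices and the function name are illustrative assumptions.

```python
import numpy as np

def rate_w_identity(P, H_samples, ratio=1.0):
    """Average W = I lower bound (nats) with Sx = (P/t) I, Ss = ratio * Sx, Sz = I."""
    r, t = H_samples[0].shape
    total = 0.0
    for H in H_samples:
        Sy = np.eye(r) + (1.0 + ratio) * (P / t) * (H @ H.conj().T)
        total += np.linalg.slogdet(Sy)[1]
    # log(|Sx| / |Sx + Ss|) = -t log(1 + ratio) stays constant as P grows.
    return total / len(H_samples) - t * np.log(1.0 + ratio)

rng = np.random.default_rng(6)
t, r = 3, 2
H_samples = (rng.standard_normal((500, r, t))
             + 1j * rng.standard_normal((500, r, t))) / np.sqrt(2)

P_low, P_high = 1e4, 1e6
slope = (rate_w_identity(P_high, H_samples) - rate_w_identity(P_low, H_samples)) \
        / (np.log(P_high) - np.log(P_low))
```

For this 3 × 2 example the estimated `slope` is close to min(t, r) = 2, even though the transmit covariance does not depend on H.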



6 It is relevant from the point of view of the MIMO Gaussian BC
to consider $Q/P$ remaining constant as $P \to \infty$, and hence the same
assumption is made in the earlier simulations as well.




Fig. 1. Numerical results using the two algorithms. (a) Achievable rates vs. P: Algorithm 1. (b) Comparison of the two algorithms: 2x2 FDPC. (c) Comparison of the two algorithms: 3x2 FDPC. (No-S-ub denotes the no-interference upper-bound.)

Fig. 2. A case in which the algorithms do not work.



The above theorem can be further strengthened in the
case of FDPC with t ≤ r.

Theorem 2: For the FDPC with t ≤ r, if the ratio
$Q/P$ is held constant, the difference $\Delta R$ between the
no-interference upper-bound and the rate achievable under
no (and hence partial) CSIT tends to zero as $P \to \infty$.

Proof: Under the specific choice of W = I, we get 



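$$\begin{aligned}
\Delta R &= E_H \log|I + TH\Sigma_X H^*T^*| - E_H \log|I + TH(\Sigma_X + \Sigma_S)H^*T^*| - \log\frac{|\Sigma_X|}{|\Sigma_X + \Sigma_S|} \\
&= E_H \log\frac{|I + TH\Sigma_X H^*T^*|}{|I + TH(\Sigma_X + \Sigma_S)H^*T^*|} - \log\frac{|\Sigma_X|}{|\Sigma_X + \Sigma_S|} \\
&= E_H \log\frac{|P^{-1}I + \Sigma'_X H^*\Sigma_Z^{-1}H|}{|P^{-1}I + (\Sigma'_X + \Sigma'_S)H^*\Sigma_Z^{-1}H|} - \log\frac{|\Sigma'_X|}{|\Sigma'_X + \Sigma'_S|},
\end{aligned}$$

where $\Sigma_Z = (T^*T)^{-1}$, $\Sigma_X = P\Sigma'_X$, $\Sigma_S = P\Sigma'_S$, and the
third equality follows from the fact that $|I + AB| = |I + BA|$
whenever the products AB and BA are defined.

For the case of t ≤ r, the matrix $H^*\Sigma_Z^{-1}H$ is invertible.
Thus, we obtain $\lim_{P \to \infty} \Delta R = 0$. Therefore, the choice
W = I is optimal at high SNR for FDPCs with t ≤ r. ■
For the t > r case, $\Delta R$ does not go to zero, except
in some special cases.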
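
The vanishing gap for t ≤ r can be observed numerically by evaluating the last expression for $\Delta R$ at a small and a large value of P. The sketch below is illustrative: it assumes $\Sigma_Z = I$, normalized covariances $\Sigma'_X = \Sigma'_S = I/t$, i.i.d. complex Gaussian fading with t = 2 and r = 3, and a hypothetical function name `delta_r`.

```python
import numpy as np

def delta_r(P, H_samples, Sx_unit, Ss_unit):
    """Gap (nats) between the no-interference bound and the W = I rate, with Sz = I."""
    t = Sx_unit.shape[0]
    I = np.eye(t)
    ld_x = np.linalg.slogdet(Sx_unit)[1]
    ld_xs = np.linalg.slogdet(Sx_unit + Ss_unit)[1]
    gaps = []
    for H in H_samples:
        K = H.conj().T @ H                       # H* Sz^{-1} H with Sz = I
        num = np.linalg.slogdet(I / P + Sx_unit @ K)[1]
        den = np.linalg.slogdet(I / P + (Sx_unit + Ss_unit) @ K)[1]
        gaps.append(num - den)
    return float(np.mean(gaps) - (ld_x - ld_xs))

rng = np.random.default_rng(3)
t, r = 2, 3
H_samples = (rng.standard_normal((2000, r, t))
             + 1j * rng.standard_normal((2000, r, t))) / np.sqrt(2)
Sx_unit = np.eye(t) / t
Ss_unit = np.eye(t) / t

gap_low = delta_r(1.0, H_samples, Sx_unit, Ss_unit)
gap_high = delta_r(1e4, H_samples, Sx_unit, Ss_unit)
```

In this setting `gap_high` is orders of magnitude smaller than `gap_low`, consistent with the limit in the proof.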

There is a nice intuitive explanation of why $\Delta R$ tends
to zero when t ≤ r. Consider the perfect-CSIT-optimal W,
i.e., $W = \Sigma_X H^*(H\Sigma_X H^* + \Sigma_Z)^{-1}H$. For t ≤ r, it can be
proved that $W \to I$ as $P \to \infty$ for any value of H. Hence,
when t ≤ r, in the limit of high SNR, knowledge of H is
not required as far as the determination of W is concerned, and
therefore we get $\Delta R = 0$ in the limit. However, for the case
of t > r, we get $W \to \Sigma_X H^*(H\Sigma_X H^*)^{-1}H$ (as $P \to \infty$),
which depends on the value of H. Hence, for FDPCs with
t > r, even in the limit, CSIT is required for the determination
of W, and thus $\Delta R$ does not go to zero in general.
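
The two limiting behaviors of the perfect-CSIT inflation factor can be seen directly at a large value of P. The sketch below assumes $\Sigma_X = (P/t)I$, $\Sigma_Z = I$, and one random complex Gaussian realization per case; the helper names are hypothetical.

```python
import numpy as np

def perfect_csit_w(H, P):
    """W = Sx H* (H Sx H* + Sz)^{-1} H with Sx = (P/t) I and Sz = I (assumed covariances)."""
    r, t = H.shape
    Sx = (P / t) * np.eye(t)
    return Sx @ H.conj().T @ np.linalg.inv(H @ Sx @ H.conj().T + np.eye(r)) @ H

rng = np.random.default_rng(4)

def draw(r, t):
    """One r x t channel realization with i.i.d. complex Gaussian entries."""
    return (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)

# Frobenius distance between W and the identity at P = 1e6.
gap_tall = np.linalg.norm(perfect_csit_w(draw(3, 2), 1e6) - np.eye(2))   # t = 2 <= r = 3
gap_wide = np.linalg.norm(perfect_csit_w(draw(2, 3), 1e6) - np.eye(3))   # t = 3 >  r = 2
```

For t ≤ r the distance is essentially zero, whereas for t > r the limit of W has rank at most r and therefore stays away from the identity.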

An important implication of the above theorem is the 
following proposition. 

Proposition 2: For FDPCs with t ≤ r, Costa's choice of
auxiliary RV, namely, U = X + WS is optimal at high SNR 
even under partial or no CSIT. 

Thus, for the first time the optimality of Costa's choice of 
auxiliary RV is shown under partial CSIT (even though it 
is only in a special case). 

B. Gaussian MIMO BC 

Theorem 1, proved in the previous subsection, has an
important consequence for the achievable sum-rate scaling
factor of the Gaussian MIMO BC with partial or no
CSIT. It is now well established that, given a fixed level of
partial CSIT, beamforming-based multi-user transmission
strategies cannot achieve sum-rate scaling over the MIMO
BC [11]. Thus, until now, only the single-user strategy
of time-division multiple access (TDMA) was known to
achieve sum-rate scaling. However, as the following
proposition suggests, a DPC-based multi-user transmission
strategy can also achieve the same.

Proposition 3: For the Gaussian MIMO BC with t transmit
antennas and users with r receive antennas each, a high-SNR
sum-rate scaling factor of min(t, r) log SNR can be
achieved even without any CSIT if DPC is used.

Proof: If DPC is used at the transmitter, then for the user
encoded last, unlike for the other users, the entire interference
can (potentially) be canceled by DPC. Hence, as per Theorem 1,
the achievable rate for the last user can be made to scale
in the high-SNR regime as min(t, r) log SNR. ■
Though TDMA can achieve the same scaling factor, this
result is interesting because it is the first time that a
multi-user strategy is shown to achieve a non-zero high-SNR
sum-rate scaling factor over the Gaussian MIMO BC
with partial or no CSIT. Also, this proposition is in accordance
with the main result and the conjecture of [12].

V. Conclusion and Future Scope 

The paper proposes two good algorithmic solutions for 
the problem of determination of inflation factor. Apart 
from this, some important results are proved analytically 
in the high SNR regime. Our algorithmic solutions are 
found to work well, except in some cases as mentioned 
before. More efforts are required to better the performance 
in these cases. 